1. INTRODUCTION

Switchgear is one of the vital pieces of equipment in a power distribution network [1]. In a power generating
system, switchgears act as a means to isolate and de-energize specific electrical components and buses to
ensure the safety of downstream maintenance work, such as fault clearing, routine maintenance, and equipment
replacement. They are generally categorized by insulating medium, such as air or oil, and are typically
specified in low, medium, and high voltage classes [2]. It is important to monitor closely
the condition and performance of operating switchgears. Diagnosis and corrective maintenance of faulty
switchgears should be prompt. A single incident can have dire effects on the distribution
network, operational staff and thousands of end users, which will in turn cause major spikes in customer
interruption statistics and harm regulatory perception [3, 4].

Failures in switchgears are usually caused by gradual degradation of parts such as insulators,
switches and connectors [5]. At an early stage, electrical faults such as corona, surface discharge and arcing
produce noise that is detectable in the frequency range (20 kHz to 100 kHz) of an ultrasonic detection
system [6]. Such faults are not easily visible to the naked eye, but the noise can be identified with
ultrasonic detection systems [7]. The implementation of ultrasound detection systems can give utility
companies a new approach to improving the reliability and performance of critical electrical assets.
To date, switchgear fault detection in Malaysia relies heavily on manual random inspection by qualified
technical experts [8]. An organized soft computing system with a growing database can help to ensure a more
systematic inspection routine.

Switchgear failures can generally be categorized into several types, such as arcing, tracking, surface
discharge and mechanical failure [9]. One of the commonly encountered faults is caused by Corona
discharge [10]. Corona is the glow or electrical discharge around conductors [11]. Corona starts almost
silently and occurs when the surrounding air is stressed beyond its ionization point without developing
flashover [12]. The air between layers of insulation becomes charged when the electrical stress exceeds
the insulation value of the air. When the humidity and moisture in the air or gas exceed certain values,
Corona discharge forms Ozone and Nitrogen Oxides [13]. These, when combined with moisture,
produce nitric acid, which is destructive to most dielectrics and certain metallic compositions, resulting
in corrosion [14]. In addition, the high energy in some discharges results in mechanical, electrical and thermal
damage. Corona occurs only at voltages above 1000 V, and it seeks a path to ground [15].
Left uncorrected, Corona activity can advance to the surface discharge stage on the insulation board of
a live part. The carbon deposits and light brown discoloration of the insulation board may then become
visible to the naked eye of maintenance personnel. Undetected corona can cause further deterioration of
the insulator, which in turn leads to other failures such as surface discharge and eventually arcing [16].

This research aims to explore the implementation of soft computing and ultrasonic inspection
systems to detect Corona discharge faults at their early stages. This can help utility companies to take the
necessary corrective measures to prevent further failures, which could otherwise lead to catastrophic losses.
In this paper, a modified recognition algorithm enhanced with an Extreme Learning Machine (ELM) mechanism
is proposed for the detection of corona faults in a switchgear. The major contribution of this research is
the development and implementation of the ELM to identify Corona discharge via the sound waves
generated. The layout of the paper is as follows: Section 2 discusses the modified ELM algorithm and its
implementation in detail. The experimental results and related discussion are presented in Section 3.
Section 4 concludes the paper.


2. EXTREME LEARNING MACHINE 

Extreme learning machine (ELM) is a competitive machine learning mechanism. It is simple in
theory and fast in implementation. The literature indicates that the ELM has a significantly higher learning
speed than traditional feed-forward network learning algorithms while showing better
generalization performance [17-20]. Based on empirical risk minimization theory, the learning process
of the ELM requires only a single iteration. Compared with traditional learning algorithms, the ELM tends to reach a
smaller training error with a smaller norm of weights [21]. This, in turn, leads to better generalization
performance [22]. To date, the ELM has shown good performance in regression applications as well as in
large dataset classification applications [23-27]. This emerging learning mechanism is gaining popularity due
to its robustness, controllability, fast learning rate, and good generalization performance. In this research,
the Gaussian Mercer Classifier is incorporated in the ELM as the decision-making kernel for the switchgear
health condition. The flowchart of the algorithm is illustrated in Figure 1. The development of the switchgear
health condition determination algorithm can be divided into three major phases,
namely the training phase, the validation phase, and the prediction phase for new data. Several steps are
required to complete each phase.


2.1. Training phase 
The first step of this phase is training data collection. Data are collected by using a Partial
Discharge (PD) Detector. In the ELM, all input data are stored in the matrix shown in (1).


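$$ X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1M} \\ x_{21} & x_{22} & \cdots & x_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{NM} \end{bmatrix}_{N \times M} \qquad (1) $$

In this study, M, the number of features per sample, is set to 10 000 while N is the number of data samples, yielding (2).

$$ X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,10000} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,10000} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N,1} & x_{N,2} & \cdots & x_{N,10000} \end{bmatrix}_{N \times 10000} \qquad (2) $$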
[Figure 1 flowchart, summarized]
Training phase: Obtain training and validation data, each with input and target output. The number of training data is at least twice the number of validation data. For example, if there are 295 data, then 238 are used for training, 42 for validation and the remainder for testing. Start by setting L to a small value (e.g., 5).
Validation phase: Obtain the training and validation accuracy rates. If the accuracy rates are not satisfactorily high, increase L (e.g., by 2) and repeat the training and validation procedures. Once both accuracy rates are satisfactorily high, save L and the classifier weights for the prediction of new, unlabelled data.

Figure 1. The Flowchart for Main Procedures of ELM Training and Validation


Equation (3) shows the target output vector, T.

$$ T = \begin{bmatrix} t_{1} \\ t_{2} \\ \vdots \\ t_{N} \end{bmatrix}_{N \times 1} \qquad (3) $$

The second step is initialization. The number of activation functions (hidden nodes), L, is set to a positive
integer value. The type of activation function is then chosen; common choices are the Gaussian Radial
Basis Function (RBF) and the Sigmoid function. An input weight matrix, a, and a bias vector, b, are randomly
assigned. Upon completion, the activation function is computed in the hidden layer. The Sigmoid function is
represented in (4) and (5).





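$$ G(a_{i}, b_{i}, x_{j}) = \frac{1}{1 + e^{-(a_{i} \cdot x_{j} + b_{i})}} \qquad (4) $$

$$ H_{ji} = \frac{1}{1 + e^{-(a_{1i} x_{j1} + a_{2i} x_{j2} + \cdots + a_{Mi} x_{jM} + b_{i})}} \qquad (5) $$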


The same activation formula is used for all nodes. Then, the hidden layer matrix is computed. The hidden layer
matrix and the RBF function are defined in (6) and (7), respectively.


$$ \text{Hidden Layer} = \begin{bmatrix} H_{1} \\ H_{2} \\ \vdots \\ H_{N} \end{bmatrix}_{N \times L} \qquad (6) $$

$$ H_{i} = e^{-b_{i} \lVert x - a_{i} \rVert^{2}} \qquad (7) $$


Note that either the Sigmoid or the RBF function needs to be used. In this project, the Sigmoid function has been
chosen in the programming of the classifier.
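As an illustration of the hidden layer computation, the following is a minimal NumPy sketch rather than the authors' implementation. The function names, the uniform range used for the random weights, the toy matrix sizes and the squared-distance form of the RBF are all assumptions made for the example.

```python
import numpy as np

def sigmoid_hidden_layer(X, a, b):
    """Return the N x L hidden layer matrix for sigmoid nodes.

    X: (N, M) input samples, a: (M, L) input weights, b: (L,) biases.
    """
    return 1.0 / (1.0 + np.exp(-(X @ a + b)))

def rbf_hidden_layer(X, a, b):
    """Gaussian RBF alternative: columns of a are L centres, b holds L widths."""
    # Squared Euclidean distance between every sample and every centre.
    sq_dist = ((X[:, None, :] - a.T[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-b * sq_dist)

rng = np.random.default_rng(0)        # fixed seed, for repeatability only
N, M, L = 4, 6, 3                     # toy sizes; the paper uses M = 10 000
X = rng.standard_normal((N, M))       # synthetic inputs, not ultrasound data
a = rng.uniform(-1.0, 1.0, (M, L))    # randomly assigned input weights (assumed range)
b = rng.uniform(-1.0, 1.0, L)         # randomly assigned biases (assumed range)

H = sigmoid_hidden_layer(X, a, b)
print(H.shape)  # (4, 3)
```

Each row of H corresponds to one sample and each column to one hidden node, which matches the N x L shape of the hidden layer matrix.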

In step 4, the output weights are calculated. Under ideal conditions, it is assumed that Y=Hβ, where
β is the output weight matrix. The weights would then be obtained by β=H^{-1}Y. However, the ordinary inverse
(inv function) cannot be computed since H is very likely to be a non-square matrix. To solve this problem,
the Moore-Penrose pseudoinverse (pinv function) is employed, yielding (8).


$$ \beta = (H^{T} H)^{-1} H^{T} Y \qquad (8) $$
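The pseudoinverse step can be checked numerically with a small sketch. It assumes NumPy, a randomly generated stand-in for H, and targets coded as +1 and -1; none of these come from the paper's data. When H has full column rank, `numpy.linalg.pinv` gives the same output weights as the explicit normal-equation form in (8).

```python
import numpy as np

rng = np.random.default_rng(1)
N, L = 8, 3                                    # toy sizes with N > L, so H is non-square
H = rng.standard_normal((N, L))                # stand-in hidden layer matrix
Y = rng.choice([-1.0, 1.0], size=(N, 1))       # stand-in targets (assumed +1 / -1 coding)

beta_pinv = np.linalg.pinv(H) @ Y                  # Moore-Penrose pseudoinverse route
beta_normal = np.linalg.solve(H.T @ H, H.T @ Y)    # explicit form of (8)

print(np.allclose(beta_pinv, beta_normal))
```

The pseudoinverse route is usually preferred in practice because it remains defined when the product of H transposed and H is singular or poorly conditioned.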


In step 5, the accuracy rate for the training data is computed. Once β is computed, the same training
data are used to calculate the accuracy rate. The output matrix Y=[y1 y2 ... yN]^T is computed based on H,
as shown in (9).


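$$ Y = \operatorname{signum}(H\beta), \qquad \operatorname{signum}(v) = \begin{cases} 1 & \text{if } v > 0 \\ -1 & \text{otherwise} \end{cases} \qquad (9) $$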


The accuracy rate of the training data is calculated as 100% x the number of training data that are correctly
classified, divided by the number of training data (N). The final step is to save L and the input and output
weights (a, b and β) for the validation and prediction phases.
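The accuracy computation can be sketched as follows. This is a hedged example with a hand-made hidden layer matrix and weights; the function names are hypothetical, and the -1 branch of the signum function is an assumption about the two-class coding.

```python
import numpy as np

def signum(v):
    # +1 for strictly positive scores, -1 otherwise (assumed two-class coding)
    return np.where(v > 0, 1, -1)

def accuracy_rate(H, beta, targets):
    """Percentage of samples whose signum(H @ beta) matches the target."""
    predicted = signum(H @ beta).ravel()
    return 100.0 * np.mean(predicted == np.asarray(targets).ravel())

# Hand-made toy values: the scores H @ beta are 1, -1, 0 and 0.3.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]])
beta = np.array([[1.0], [-1.0]])
T = np.array([1, -1, 1, 1])

print(accuracy_rate(H, beta, T))  # 75.0
```

The third sample has a score of exactly zero, so it is classified as -1 and counted as an error, which gives three correct classifications out of four.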


2.2. Validation phase 
The first step in this phase is to prepare the validation data. A set of P
validation samples is collected, each with an input vector and a respective target output, as shown in (10).


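$$ W = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1M} \\ \vdots & \vdots & \ddots & \vdots \\ w_{P1} & w_{P2} & \cdots & w_{PM} \end{bmatrix}_{P \times M}, \qquad D = \begin{bmatrix} d_{1} \\ \vdots \\ d_{P} \end{bmatrix}_{P \times 1} \qquad (10) $$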


The second step is to load previous information: L and the input and output weights (a, b and β) are loaded from
the training phase. In the third step, the hidden layer matrix is calculated. The hidden layer matrix and sigmoid
function are defined in (11).


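$$ H_{ji} = \frac{1}{1 + e^{-(a_{i} \cdot w_{j} + b_{i})}}, \qquad \text{Hidden Layer} = \begin{bmatrix} H_{1} \\ \vdots \\ H_{P} \end{bmatrix}_{P \times L} \qquad (11) $$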


Step 4 computes the accuracy rate for the validation data. The output matrix Y=[y1 y2 ... yP]^T is
computed based on H, with Y=signum(Hβ). The accuracy rate of the validation data is found by
multiplying the number of validation data that are correctly classified by 100% and dividing by the number
of validation data (P).
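The training and validation procedure summarized in Figure 1 can be sketched as a loop over L. This is a minimal sketch under stated assumptions, not the authors' code: the 90% threshold, the cap on L, the random weight range, the function names and the synthetic two-cluster data are all hypothetical. Only the starting value of 5 and the step of 2 follow the examples given in the flowchart.

```python
import numpy as np

def hidden_layer(X, a, b):
    # Sigmoid hidden layer output: one row per sample, one column per node
    return 1.0 / (1.0 + np.exp(-(X @ a + b)))

def train(X, T, L, rng):
    a = rng.uniform(-1.0, 1.0, (X.shape[1], L))        # random input weights
    b = rng.uniform(-1.0, 1.0, L)                      # random biases
    beta = np.linalg.pinv(hidden_layer(X, a, b)) @ T   # output weights
    return a, b, beta

def accuracy(X, T, a, b, beta):
    Y = np.where(hidden_layer(X, a, b) @ beta > 0, 1, -1)
    return 100.0 * np.mean(Y == T)

def search_hidden_nodes(X_tr, T_tr, X_va, T_va,
                        threshold=90.0, L=5, step=2, L_max=199, seed=0):
    """Increase L until both accuracy rates reach the threshold or L hits the cap."""
    rng = np.random.default_rng(seed)
    while True:
        a, b, beta = train(X_tr, T_tr, L, rng)
        acc_tr = accuracy(X_tr, T_tr, a, b, beta)
        acc_va = accuracy(X_va, T_va, a, b, beta)
        satisfied = acc_tr >= threshold and acc_va >= threshold
        if satisfied or L + step > L_max:
            return L, (a, b, beta), acc_tr, acc_va
        L += step

# Synthetic two-class data standing in for the ultrasound features.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-1.0, 0.5, (60, 4)), rng.normal(1.0, 0.5, (60, 4))])
T = np.concatenate([-np.ones(60), np.ones(60)])
idx = rng.permutation(120)
tr, va = idx[:90], idx[90:]                            # 90 training, 30 validation

L_best, weights, acc_tr, acc_va = search_hidden_nodes(X[tr], T[tr], X[va], T[va])
print(L_best, acc_tr, acc_va)
```

The cap `L_max` is added only so that the loop terminates when the threshold cannot be reached; the flowchart itself does not specify a stopping limit.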


2.3. Prediction of new data

The first step in this phase is to load the previous L and the input and output weights (a, b and β) from
the training phase. This is followed by loading the new input data and classifying them as shown in (12),
where y is the classified switchgear health condition based on the input data.


$$ y = \operatorname{signum}(h\beta) \qquad (12) $$

Information and data from various switchgears were gathered in order to develop the algorithm
to identify corona discharge in the switchgear. Specifically, repair information recorded during maintenance
and ultrasound data were acquired. The collected ultrasound data were segregated into 314 normal
cases (no fault) and 228 cases of corona discharge. Noise was removed from the ultrasound data before they
were uploaded as training data for the ELM model. Figure 2 shows an example of the wave pattern of the sound made
during a Corona discharge.
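Once L, a, b and β have been saved, classifying a new recording amounts to a single forward pass, as in (12). The sketch below uses hand-picked toy weights rather than trained ones, and the function name and the reading of +1 as Corona and -1 as non-Corona are assumptions made for illustration.

```python
import numpy as np

def predict(x_new, a, b, beta):
    """Classify one new input vector with saved weights a, b and beta."""
    h = 1.0 / (1.0 + np.exp(-(x_new @ a + b)))   # hidden layer output row h
    return 1 if float(h @ beta) > 0 else -1      # signum of h times beta

# Hand-picked toy weights for two inputs and two hidden nodes (not trained values).
a = np.array([[1.0, -1.0], [0.5, 0.5]])
b = np.array([0.0, 0.0])
beta = np.array([2.0, -2.0])

print(predict(np.array([1.0, 0.0]), a, b, beta))    # 1
print(predict(np.array([-1.0, 0.0]), a, b, beta))   # -1
```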


[Figure 2: time domain plot; vertical axis: Sample Data; horizontal axis: Time (x125 us), shown from 7600 to 9200]


Figure 2. Corona discharge sound wave pattern sample 


3. RESULTS AND ANALYSIS 

Experiments were carried out to test the performance of the developed algorithm in identifying
Corona faults in switchgears based on the sound waves generated during operation. The experiments
were conducted in two domains, namely the time domain and the frequency domain. The results
obtained by the algorithm in the training, validation and testing phases are shown and discussed.
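The paper does not state how the frequency domain features are derived from the recordings. One plausible route, sketched here purely as an assumption, is a one-sided FFT magnitude spectrum: a record of 10,000 time samples yields 5,001 bins, and dropping the DC bin leaves 5,000 features. The 125 us sample interval is read from the time axis of Figure 2, and the test signal is a synthetic tone, not measured data.

```python
import numpy as np

SAMPLE_INTERVAL_S = 125e-6    # assumed from the "x125 us" time axis of Figure 2

def magnitude_spectrum(signal, n_features=5000):
    """One-sided FFT magnitudes with the DC bin dropped (assumed feature choice)."""
    spectrum = np.abs(np.fft.rfft(signal))
    return spectrum[1:n_features + 1]

t = np.arange(10_000) * SAMPLE_INTERVAL_S
signal = np.sin(2 * np.pi * 1000.0 * t)       # synthetic 1 kHz tone
features = magnitude_spectrum(signal)

# Frequencies of the retained bins, in Hz.
freqs = np.fft.rfftfreq(signal.size, d=SAMPLE_INTERVAL_S)[1:5001]
peak_hz = freqs[np.argmax(features)]
print(features.shape, round(float(peak_hz), 3))
```

For the synthetic tone, the largest magnitude falls in the bin at 1 kHz, which is a quick sanity check that the bins and frequencies are aligned.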

A total of 160 data samples were used for ELM training, validation and testing in the time domain
analysis. The number of features is 10,000 and the number of hidden neurons is 1,200. The corona
time domain classifier categorizes every data instance of a test dataset as either positive or negative.
This classification produces four outcomes: true positive (TP), true negative (TN), false positive (FP) and
false negative (FN). The classification, or corona fault detection, accuracy (ACC) is calculated as the total
number of correct classifications (TP + TN) divided by the total size of the dataset (P + N), as follows:


$$ ACC = \frac{TP + TN}{TP + TN + FN + FP} \times 100\% = \frac{TP + TN}{P + N} \times 100\% $$


The error rate (ERR) is calculated as the number of all incorrect classifications divided by the total number 
of the dataset by using the equation as follows: 


$$ ERR = \frac{FP + FN}{TP + TN + FN + FP} \times 100\% = \frac{FP + FN}{P + N} \times 100\% $$
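As a worked check of the two formulas, the short sketch below recomputes the accuracy and error rate from the time domain training counts reported in Table 1 (26 true positives, 90 true negatives, 9 false positives and 3 false negatives). The function name is hypothetical.

```python
def accuracy_and_error(tp, tn, fp, fn):
    """Return (ACC, ERR) in percent from the four confusion matrix counts."""
    total = tp + tn + fp + fn
    acc = 100.0 * (tp + tn) / total
    err = 100.0 * (fp + fn) / total
    return acc, err

# Time domain training phase counts from Table 1.
acc, err = accuracy_and_error(tp=26, tn=90, fp=9, fn=3)
print(f"{acc:.3f}% {err:.3f}%")  # 90.625% 9.375%
```

The two rates always sum to 100%, so either one is sufficient to recover the other.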


Table 1 shows the output matrix for the training phase in the time domain. There are 128 sets of data used
in the training phase; 26 cases of Corona and 90 cases of non-Corona are correctly identified; 3 cases
of Corona are wrongly identified as non-Corona while 9 cases of non-Corona are wrongly identified as
Corona. Overall, the accuracy is 90.63% with an error rate of 9.37%.

In the validation phase of the time domain analysis, 24 sets of data are used. One case of Corona and 20 cases
of non-Corona are correctly identified. Overall, the accuracy is 87.5% while the error
rate is 12.5%. Table 2 shows the output matrix. Table 3 shows the output matrix for the testing phase in the time
domain, in which 8 sets of data are used. The algorithm correctly identified 1 Corona fault and all 6 of the
non-Corona cases. The testing yields an overall accuracy of 87.5% with a 12.5% error rate.

The frequency domain analysis also employed 160 sets of data, of which 128 sets, 24 sets, and 8 sets are used
for the training, validation, and testing phases respectively. The number of features is 5,000 while the number
of hidden neurons is 150 for the frequency domain analysis. The accuracy rates and error rates are calculated
in the same way as in the time domain analysis. Table 4 shows the output matrix for the training phase in the
frequency domain. There are 21 cases of Corona and 94 cases of non-Corona correctly identified; 3 cases of
Corona are wrongly identified as non-Corona while 10 cases of non-Corona are wrongly identified as Corona.
Overall, the accuracy is 89.84% with an error rate of 10.16%.

In the validation phase of the frequency domain analysis, 24 sets of data are used. Three cases
of Corona and 17 cases of non-Corona are correctly identified. Overall, the accuracy is
83.33% while the error rate is 16.67%. Table 5 shows the output matrix. Table 6 shows the output matrix
for the testing phase in the frequency domain, in which 8 sets of data are used. The algorithm correctly
identified both of the Corona faults and 5 non-Corona cases. The testing yields an overall accuracy of 87.5%
with a 12.5% error rate.


Table 1. Output matrix for training phase: time domain corona fault classification

                      Identified to be Corona    Identified to be non-Corona
Actual Corona                   26                            3
Actual non-Corona                9                           90


Table 2. Output matrix for validation phase: time domain corona fault classification

                      Identified to be Corona    Identified to be non-Corona
Actual Corona                    1                            1
Actual non-Corona                2                           20





Table 3. Output matrix for testing phase: time domain corona fault classification

                      Identified to be Corona    Identified to be non-Corona
Actual Corona                    1                            1
Actual non-Corona                0                            6


Table 4. Output matrix for training phase: frequency domain corona fault classification

                      Identified to be Corona    Identified to be non-Corona
Actual Corona                   21                            3
Actual non-Corona               10                           94





Table 5. Output matrix for validation phase: frequency domain corona fault classification

                      Identified to be Corona    Identified to be non-Corona
Actual Corona                    3                            2
Actual non-Corona                2                           17


Table 6. Output matrix for testing phase: frequency domain corona fault classification

                      Identified to be Corona    Identified to be non-Corona
Actual Corona                    3                            2
Actual non-Corona                2                           17





4. CONCLUSION 

Switchgear is a component of high importance in a distribution network for ensuring safety, especially
during downstream maintenance. Without proper monitoring and inspection, a switchgear can fail due
to many types of faults. A robust fault identification system can be very useful in replacing manual
and random inspections. In this research, a sound-wave-based fault detection system is proposed
with the implementation of the Extreme Learning Machine (ELM). Experiments were carried out to investigate
the performance of the developed algorithm in identifying Corona faults in switchgears. Analyses were carried
out in the time and frequency domains. In the time domain analysis, the results show success rates of 90.63%,
87.5%, and 87.5% in differentiating Corona and non-Corona cases in the training, validation
and testing phases respectively. In the frequency domain analysis, the results show success rates of 89.84%,
83.33%, and 87.5% in the training, validation and testing phases respectively. It can thus be concluded that the
developed algorithm performed well in identifying Corona faults in switchgears. With the development
of the algorithm, utility companies can adopt a standard analysis, which in turn supports more accurate
decision making in prioritizing remedial works. In future, the research can be
expanded and implemented to identify other switchgear faults, and even other engineering
categorization problems.